Recovery of Empty Nodes in Parse Structures

Authors

Denis Filimonov

Mary P. Harper

Abstract

In this paper, we describe a new algorithm for recovering WH-trace empty nodes. Our approach combines a set of hand-written patterns together with a probabilistic model. Because the patterns heavily utilize regular expressions, the pertinent tree structures are covered using a limited number of patterns. The probabilistic model is essentially a probabilistic context-free grammar (PCFG) approach with the patterns acting as the terminals in production rules. We evaluate the algorithm’s performance on gold trees and parser output using three different metrics. Our method compares favorably with state-of-the-art algorithms that recover WH-traces.
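The abstract's point that regular expressions let a few patterns cover many tree structures can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the paper: the pattern names, the encoding of a tree path as a space-separated chain of node labels, and the probability values are all hypothetical. The sketch also does not implement a PCFG; it replaces the probabilistic model with a plain lookup table so that only the pattern-selection idea is shown.

```python
import re

# Hypothetical patterns (not from the paper). Each is a regular expression
# over a space-separated chain of node labels running from the clause that
# hosts a WH-phrase down to a candidate trace site.
PATTERNS = {
    # One or more clauses, each followed by one or more nested VPs.
    "object_gap": re.compile(r"SBAR(?: S(?: VP)+)+"),
    # The trace sits directly under the first clause.
    "subject_gap": re.compile(r"SBAR S"),
}

# Made-up scores standing in for a learned probabilistic model.
PATTERN_PROB = {"object_gap": 0.6, "subject_gap": 0.4}


def matching_patterns(path):
    """Return the names of all patterns that match the whole label path."""
    return sorted(name for name, rx in PATTERNS.items() if rx.fullmatch(path))


def best_pattern(path):
    """Pick the highest-scoring matching pattern, or None if nothing matches."""
    candidates = matching_patterns(path)
    if not candidates:
        return None
    return max(candidates, key=lambda name: PATTERN_PROB[name])
```

A single regular expression such as `object_gap` accepts paths of any depth (`SBAR S VP`, `SBAR S VP VP`, `SBAR S VP S VP VP`), which is the sense in which a limited number of patterns can cover many structures.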


Similar resources

Parsing and Empty Nodes

This paper describes a method for ensuring the termination of parsers using grammars that freely posit empty nodes. The basic idea is that each empty node must be associated with a lexical item appearing in the input string, called its sponsor. A lexical item, as well as labeling the node for the corresponding word, provides labels for a fixed number, possibly zero, of empty nodes. The number of n...


A Simple Pattern-matching Algorithm for Recovering Empty Nodes and their Antecedents

This paper describes a simple pattern-matching algorithm for recovering empty nodes and identifying their co-indexed antecedents in phrase structure trees that do not contain this information. The patterns are minimal connected tree fragments containing an empty node and all other nodes co-indexed with it. This paper also proposes an evaluation procedure for empty node recovery procedures which ...


A Common Parsing Scheme for Left- and Right-Branching Languages

This paper presents some results of an attempt to develop a common parsing scheme that works systematically and realistically for typologically varied natural languages. The scheme is bottom-up, and the parser scans the input text from left to right. However, unlike the standard LR(k) parser or Tomita's extended LR(1) parser, the one presented in this paper is not a pushdown automaton based on ...


Effects of Empty Categories on Machine Translation

We examine effects that empty categories have on machine translation. Empty categories are elements in parse trees that lack corresponding overt surface forms (words) such as dropped pronouns and markers for control constructions. We start by training machine translation systems with manually inserted empty elements. We find that inclusion of some empty categories in training data improves the ...


Trace Prediction and Recovery with Unlexicalized PCFGs and Slash Features

This paper describes a parser which generates parse trees with empty elements in which traces and fillers are co-indexed. The parser is an unlexicalized PCFG parser which is guaranteed to return the most probable parse. The grammar is extracted from a version of the PENN treebank which was automatically annotated with features in the style of Klein and Manning (2003). The annotation includes GP...


Publication date: 2007
